light field
DELIFFAS: Deformable Light Fields for Fast Avatar Synthesis
Generating controllable and photorealistic digital human avatars is a long-standing and important problem in vision and graphics. Recent methods have made great progress in either photorealism or inference speed, but combining the two remains unsolved. To this end, we propose a novel method, called DELIFFAS, which parameterizes the appearance of the human as a surface light field attached to a controllable, deforming human mesh model. At its core, we represent the light field around the human with a deformable two-surface parameterization, which enables fast and accurate inference of the human appearance. This allows perceptual supervision on the full image, whereas previous approaches could only supervise individual pixels or small patches due to their slow runtime. Our carefully designed human representation and supervision strategy lead to state-of-the-art synthesis results and inference time. The video results and code are available at https://vcai.
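To make the idea of a two-surface parameterization concrete, the sketch below indexes a ray by the two points where it crosses a pair of surfaces. It is a minimal illustration only: it uses two fixed parallel planes, which is an assumption made for brevity, whereas the abstract describes surfaces that deform with the human mesh. The function name `two_plane_coords` and the plane positions `z_near` and `z_far` are hypothetical and do not come from the paper or its code.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def two_plane_coords(
    origin: Vec3,
    direction: Vec3,
    z_near: float = 0.0,  # hypothetical position of the first plane
    z_far: float = 1.0,   # hypothetical position of the second plane
) -> Tuple[float, float, float, float]:
    """Return (u, v, s, t): the ray's crossing points on planes z=z_near and z=z_far."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    if abs(dz) < 1e-12:
        raise ValueError("ray is parallel to the parameterization planes")
    # Ray parameter at which the ray reaches each plane.
    k_near = (z_near - oz) / dz
    k_far = (z_far - oz) / dz
    u, v = ox + k_near * dx, oy + k_near * dy
    s, t = ox + k_far * dx, oy + k_far * dy
    return (u, v, s, t)


# A ray starting behind both planes and travelling towards +z.
coords = two_plane_coords((0.0, 0.0, -1.0), (0.5, 0.0, 1.0))
print(coords)
```

The four numbers identify the ray without any sampling along it, so a light field stored over such coordinates can be queried once per pixel. How DELIFFAS maps these coordinates to color is not specified in the abstract and is not modeled here.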
Spatially Parallel All-optical Neural Networks
Qin, Jianwei, Liu, Yanbing, Liu, Yan, Liu, Xun, Li, Wei, Ye, Fangwei
All-optical neural networks (AONNs) have emerged as a promising paradigm for ultrafast and energy-efficient computation. These networks typically consist of multiple serially connected layers between the input and output layers, a configuration we term spatially series AONNs, with deep neural networks (DNNs) being the most prominent examples. However, such series architectures suffer from progressive signal degradation during information propagation and critically require additional nonlinearity designs to model complex relationships effectively. Here we propose a spatially parallel architecture for all-optical neural networks (SP-AONNs). Unlike series architectures, which process information sequentially through consecutively connected optical layers, SP-AONNs divide the input signal into identical copies that are fed simultaneously into separate optical layers. Through coherent interference between these parallel linear sub-networks, SP-AONNs inherently enable nonlinear computation without relying on active nonlinear components or iterative updates. We implemented a modular 4F optical system for SP-AONNs and evaluated its performance across multiple image classification benchmarks. Experimental results demonstrate that increasing the number of parallel sub-networks consistently enhances accuracy, improves noise robustness, and expands model expressivity. Our findings highlight spatial parallelism as a practical and scalable strategy for advancing the capabilities of optical neural computing.
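The role of coherent interference can be illustrated with a textbook identity: a detector measures intensity, and the intensity of two superposed coherent fields contains a cross term that depends on their relative phase. The sketch below checks that identity numerically for two scalar fields. It is not a model of the SP-AONN optical system; the field values and names such as `intensity` are assumptions chosen purely for illustration.

```python
import cmath


def intensity(field: complex) -> float:
    """Detected intensity of a complex optical field amplitude."""
    return abs(field) ** 2


# Two unit-amplitude fields with a relative phase of pi/3 (illustrative values).
a = cmath.rect(1.0, 0.0)
b = cmath.rect(1.0, cmath.pi / 3)

incoherent_sum = intensity(a) + intensity(b)  # what adding intensities would give
coherent = intensity(a + b)                   # what coherent superposition gives
cross_term = 2 * (a * b.conjugate()).real     # interference contribution

print(incoherent_sum, coherent, cross_term)
```

The coherent result differs from the sum of the individual intensities by exactly the cross term, which is why the detected signal is not additive over the interfering fields.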
Spatiotemporally Consistent Indoor Lighting Estimation with Diffusion Priors
Tong, Mutian, Wu, Rundi, Zheng, Changxi
Indoor lighting estimation from a single image or video remains a challenge due to its highly ill-posed nature, especially when the lighting condition of the scene varies spatially and temporally. We propose a method that estimates, from an input video, a continuous light field describing the spatiotemporally varying lighting of the scene. We leverage 2D diffusion priors to optimize this light field, which is represented as an MLP. To enable zero-shot generalization to in-the-wild scenes, we fine-tune a pre-trained image diffusion model to predict lighting at multiple locations by jointly inpainting multiple chrome balls as light probes. We evaluate our method on indoor lighting estimation from a single image or video and show superior performance over the baselines we compare against. Most importantly, we highlight results on spatiotemporally consistent lighting estimation from in-the-wild videos, which is rarely demonstrated in previous works.